Diel-Niche-Vignette

Introducion

The \(\texttt{Diel.Niche}\) package is used to evaluate how animals use the three fundamental periods of light: twilight (dawn & dusk), daytime, and nighttime. First, install and load the library

# Install package from GitHub via devtools.
# devtools::install_github("diel-project/Diel-Niche-Modeling",ref="main")

# Load the pacakge
  library(Diel.Niche)

The packages specifies many diel hypotheses, which together are defined into five hypothesis sets. The code names for the hypotheses and sets are,

what.hyp()
#> Traditional :  D N CR C  
#> 
#>  General :  D N CR C2 D.CR D.N CR.N 
#>  
#>  Threshold :  D.th N.th CR.th C.th 
#>  
#>  Maximizing:  D.max N.max CR.max 
#>  
#>  Variation:  D.var N.var CR.var C.var 
#>  
#>  Selection:  D.avail TW.avail N.avail EQ.avail D.TW.avail N.TW.avail D.N.avail 
#> 

A short description of each hypothesis is available by calling,

  what.hyp("CR")
#> [1] "Traditional/General Crepuscular"

To list the hypotheses of a given set,

  hyp.sets("Traditional")
#> [1] "D"  "N"  "CR" "C"

  hyp.sets("General")
#> [1] "D"    "N"    "CR"   "C2"   "D.CR" "D.N"  "CR.N"

The best way to understand each hypothesis set is by plotting hypotheses from a set. For the four fundamental sets:

Maximizing

This hypothesis set includes three hypotheses (Crepuscular Max, Diurnal Max, and Nocturnal Max) with the objective to evaluate which time period is used most. As such, there is no hypothesis about activity across multiple time periods (i.e., cathemeral).

triplot(hyp=hyp.sets("Maximizing"))

Traditional

This hypothesis set includes four hypotheses (Crepuscular, Diurnal, Nocturnal, Traditional Cathemeral) that aim to capture the general interpretation of these hypotheses from the literature. Crepuscular, Diurnal, and Nocturnal are defined based on having at least 0.80 probability (threshold probability, \(\xi_{1} = 0.80)\) in their respective diel periods (twilight, daytime, nighttime). If an animal is not mostly active in one period than it is defined as Traditional Cathemeral; this occurs when either two or three time periods are used more than \(1-\xi_{1}\). The logic behind the threshold of 0.80 is that an animal uses a clear majority of time in one diel period, but is not so strict that there is not some moderate amount of activity outside of this period. While this is the default value we set for the traditional hypothesis set in \(\texttt{Diel.Niche}\), it can easily be modified by a user when fitting these hypotheses to their data (see below).

triplot(hyp=hyp.sets("Traditional"))

General

This hypothesis set includes seven hypotheses. The Diurnal, Crepuscular, and Nocturnal hypotheses are defined the same as in Traditional. The main difference is the separation of the probability space of Traditional Cathemeral into four more specific hypotheses: General Cathemeral, Crepuscular-Nocturnal, Diurnal-Nocturnal, and Diurnal-Crepuscular. The General Cathemeral hypothesis—which represents a subset of the parameter space taken by the previously mentioned Traditional Cathemeral—aims to define when an animal uses all three diel periods (twilight, daytime, nighttime) at equal to or more than a minimum amount (i.e., \(p_{\text{tw}}, p_{\text{d}}, p_{\text{n}} \leq \xi_{1}\) and \(p_{\text{tw}}, p_{\text{d}}, p_{\text{n}} \geq \xi_{2}\)). We defined the lower threshold probability as \(\xi_{2} = 0.10\), such that we consider it important to differentiate animal activity when a diel period is used at least this much. However, if only two diel periods are used above \(\xi_{2}\) then we classify this activity using one of the binomial hypotheses (Crepuscular-Nocturnal, Diurnal-Nocturnal, and Diurnal-Crepuscular).

For an example, suppose a species is active mainly during the day with \(p_{\text{d}} = 0.78\), but is also relatively active during twilight with \(p_{\text{tw}} = 0.16\), and not very active at night with \(p_{\text{n}} = 0.06\). We would define this activity as Diurnal-Crepuscular because \(p_{\text{tw}}\) & \(p_{\text{d}} \geq \xi_{2}\), while \(p_{\text{n}} < \xi_{2}\). However, if night activity was also higher than \(\xi_{2}\), such that \(p_{\text{d}} = 0.70\), and \(p_{\text{tw}} = p_{\text{n}} = 0.15\) then we classify this activity as General Cathemeral because a moderate to large amount of activity is occuring in all three diel periods. In summary, the Traditional hypothesis set distinguishes between unimodal and multi-modal diel activity, while the General set distinguishes among unimodal, bimodal, and trimodal activity.

triplot(hyp=hyp.sets("General"))

We can also view specific hypotheses, as

triplot(hyp="D",more.points = TRUE)

Selection

This hypothesis set includes seven diel phenotype hypotheses and are defined based on an inputted amount of proportional time available to an animal in each diel period (\(\mathbf{p}_{\text{avail}} = [p_{av.tw},\;p_{av.d}]\)). The available time in the night period does not need to be input as it is derived as \(1 - p_{av.tw}, - p_{av.d}]\). The amount of time available in each diel period changes based on the day of the year and location.

First, let’s plot where day and night have equal availability as,

triplot(hyp=hyp.sets("Selection"),diel.setup = diel.ineq(p.avail=c(0.04,0.48)))

We can see the hypotheses change when we make an increasing amount of available time in the night period, such as during winter in the higher latitude areas,

triplot(hyp=hyp.sets("Selection"),diel.setup = diel.ineq(p.avail=c(0.04,0.3)))

Simulating Data

To simulate data based on a diel hypothesis, we need define a hypothesis using its coded name. Pick a hypothesis to simulate data from and how many samples.

  set.seed(45451)
  hyp=c("CR")
  sim=sim.diel(hyp=hyp,n.sample=100)
  
  #The probability value used to simulate data
  p=sim$p
  p
#>       [,1]  [,2] [,3]
#> [1,] 0.855 0.025 0.12
  
  #The simulated data
  y=sim$y
  y
#>      y_crep y_day y_night
#> [1,]     89     1      10

Setup hypothesis set and fit models

You can setup your own model set by creating a vector of coded hypothesis names or use the pre-defined model sets.

#Manual 
  hyp.set = c("D","CR") 
  
#Pre-defined
  hyp.set = hyp.sets("Traditional")

To fit the data to the hypothesis set,

  out = diel.fit(y,hyp.set)
#> Data checks Complete.
#> Calculating Bayes Factors...
#> The most supported model is: 
#>  Crepuscular (Traditional)

#Call the model probabilities for each hypothesis in the set
  out$bf.table
#>    Prior   Posterior
#> D   0.25 0.000000000
#> N   0.25 0.000000000
#> CR  0.25 0.998817404
#> C   0.25 0.001182596

To examine the available outputs from the fitted model object,

attributes(out)
#> $names
#>  [1] "bf.table"           "post.samp"          "ms.model"          
#>  [4] "ppc"                "ms.ppc"             "post.samp.ms.model"
#>  [7] "y"                  "y.vec"              "gelm.diag"         
#> [10] "ms.gelm.diag"       "bf.list"            "diel.setup"        
#> [13] "hyp.set"           
#> 
#> $class
#> [1] "list" "diel"

?diel.fit
#> starting httpd help server ... done

The function ‘diel.fit’ defaults to providing the model probabilities for each hypothesis set, but not the posterior samples of the parameters for each hypothesis. We can change this as,

  out = diel.fit(y,hyp.set,n.chains=2,post.fit = TRUE)
#> Data checks Complete.
#> Calculating Bayes Factors...
#> Posterior Sampling...
#> The most supported model is: 
#>  Crepuscular (Traditional)

We can look at a convergence criteria for each hypothesis. Note that convergence can be very poor for models that do not meet the inequality constraints well.

  out$gelm.diag
#> $D
#> Potential scale reduction factors:
#> 
#>      Point est. Upper C.I.
#> p1_1          1          1
#> p1_2          1          1
#> 
#> 
#> $N
#> Potential scale reduction factors:
#> 
#>      Point est. Upper C.I.
#> p1_1        NaN        NaN
#> p1_2          1          1
#> 
#> 
#> $CR
#> Potential scale reduction factors:
#> 
#>      Point est. Upper C.I.
#> p1_1          1       1.01
#> p1_2          1       1.01
#> 
#> 
#> $C
#> Potential scale reduction factors:
#> 
#>      Point est. Upper C.I.
#> p1_1          1       1.01
#> p1_2          1       1.00

We can plot the posterior samples from the most supported model to check convergence/mixing as,

 plot(coda::as.mcmc(out$post.samp.ms.model))

The posterior samples for all hypotheses are available in a list.

 names(out$post.samp)
#> [1] "D"  "N"  "CR" "C"

#For each of these list is a list of chains

length(out$post.samp[[1]])
#> [1] 2

#Here are the means of the posterior samples of all hypotheses for chain 1
lapply(out$post.samp,FUN=function(x){colMeans(x[[1]])})
#> $D
#>   p_crep_1    p_day_1  p_night_1 
#> 0.17652542 0.80194137 0.02153321 
#> 
#> $N
#>  p_crep_1   p_day_1 p_night_1 
#> 0.0000000 0.1051498 0.8948502 
#> 
#> $CR
#>   p_crep_1    p_day_1  p_night_1 
#> 0.87434009 0.01916783 0.10649208 
#> 
#> $C
#>   p_crep_1    p_day_1  p_night_1 
#> 0.78415074 0.03339213 0.18245713

Plotting

Using the packages bayesplot and ggplot2, we can examine our posterior distributions along with the true probabilities values,

library(ggplot2)
library(bayesplot)
#> This is bayesplot version 1.10.0
#> - Online documentation and vignettes at mc-stan.org/bayesplot
#> - bayesplot theme set to bayesplot::theme_default()
#>    * Does _not_ affect other ggplot2 plots
#>    * See ?bayesplot_theme_set for details on theme setting

posteriors=coda::as.mcmc(out$post.samp.ms.model)

plot_title <- ggtitle("Posterior distributions",
                      "with medians and 80% intervals")
mcmc_areas(posteriors, prob = 0.8) + plot_title+ 
  geom_vline(xintercept=p[1], linetype="dashed",color = c("red"), size=1)+
  geom_vline(xintercept=p[2], linetype="dashed",color = c("purple"), size=1)+
  geom_vline(xintercept=p[3], linetype="dashed",color = c("green"), size=1)
#> Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
#> ℹ Please use `linewidth` instead.
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
#> generated.

Another way to examine our hypothesis is to plot the theoretical niche space for the hypothesis along with the posterior samples from the most supported model.

triplot(hyp=out$ms.model,more.points = TRUE,
          posteriors=out$post.samp.ms.model)

Or, we can plot the whole hypothesis set

triplot(out)

The Unconstrained Model

If you are interested in comparing the constrained hypotheses to an unconstrained model, you can do that by adding it into the hypothesis set

hyp.set.new=c(hyp.sets("General"),"Uncon")
out = diel.fit(y,hyp.set=hyp.set.new,n.chains=2,post.fit = FALSE)
#> Data checks Complete.
#> Calculating Bayes Factors...
#> The most supported model is: 
#>  Crepuscular (Traditional)

#Compare the uncosntrained model to the rest of the models of the General hypothesis set
out$bf.table
#>       Prior    Posterior
#> D     0.125 0.0000000000
#> N     0.125 0.0000000000
#> CR    0.125 0.9610935074
#> C2    0.125 0.0000226832
#> D.CR  0.125 0.0000000000
#> D.N   0.125 0.0000000000
#> CR.N  0.125 0.0056302348
#> Uncon 0.125 0.0332535745

Data availabilty and an unequal prior

The package includes a set of available data that can be accessed as

head(diel.data)
#>           scientificName twilight day night trap_nights nsite  min_date
#> 1          Canis latrans        6  14    20         733   131 6/13/2013
#> 2   Didelphis virginiana       21   2    86         733   131 6/13/2013
#> 3      Mephitis mephitis        5   2    24         733   131 6/13/2013
#> 4 Odocoileus virginianus        3  76    29         733   131 6/13/2013
#> 5          Procyon lotor       42  45   201         733   131 6/13/2013
#> 6   Sciurus carolinensis        0 198     4         733   131 6/13/2013
#>   max_date mean_lat mean_lon season       country   phylum    class
#> 1 7/9/2013 41.87236 -87.8423 Summer United States Chordata Mammalia
#> 2 7/9/2013 41.87236 -87.8423 Summer United States Chordata Mammalia
#> 3 7/9/2013 41.87236 -87.8423 Summer United States Chordata Mammalia
#> 4 7/9/2013 41.87236 -87.8423 Summer United States Chordata Mammalia
#> 5 7/9/2013 41.87236 -87.8423 Summer United States Chordata Mammalia
#> 6 7/9/2013 41.87236 -87.8423 Summer United States Chordata Mammalia
#>             order      family             Project unit_type
#> 1       Carnivora     Canidae UWIN_Chicago_IL_USA     28day
#> 2 Didelphimorphia Didelphidae UWIN_Chicago_IL_USA     28day
#> 3       Carnivora  Mephitidae UWIN_Chicago_IL_USA     28day
#> 4    Artiodactyla    Cervidae UWIN_Chicago_IL_USA     28day
#> 5       Carnivora Procyonidae UWIN_Chicago_IL_USA     28day
#> 6        Rodentia   Sciuridae UWIN_Chicago_IL_USA     28day
#>             Common_name Activity_Literature
#> 1                Coyote          Cathemeral
#> 2      Virginia Opossum           Nocturnal
#> 3         Striped Skunk           Nocturnal
#> 4     White-tailed Deer         Crepuscular
#> 5      Northern Raccoon           Nocturnal
#> 6 Eastern Gray Squirrel             Diurnal
?diel.data

Here, we will use one of the data sets to fit a model with 1) equal weight on each hypotheses and 2) prior weight that is higher on the literature definition.

# Virginia Opossum data from the Urban Wildlife Information Network
dat=diel.data[2,]

dat
#>         scientificName twilight day night trap_nights nsite  min_date max_date
#> 2 Didelphis virginiana       21   2    86         733   131 6/13/2013 7/9/2013
#>   mean_lat mean_lon season       country   phylum    class           order
#> 2 41.87236 -87.8423 Summer United States Chordata Mammalia Didelphimorphia
#>        family             Project unit_type      Common_name
#> 2 Didelphidae UWIN_Chicago_IL_USA     28day Virginia Opossum
#>   Activity_Literature
#> 2           Nocturnal

y=data.frame(twilight=dat$twilight,
             day=dat$day, 
             night=dat$night)
rownames(y)=dat$Common_name

y
#>                  twilight day night
#> Virginia Opossum       21   2    86

# Model Comparison using equal eights
out1 = diel.fit(as.matrix(y),hyp.set=hyp.sets("Traditional"),post.fit = FALSE)
#> Data checks Complete.
#> Calculating Bayes Factors...
#> The most supported model is: 
#>  Nocturnal (Traditional)

# Model Comparison using 0.8 probability weigh on the nocturnal hypothesis
out2 = diel.fit(as.matrix(y),hyp.set=hyp.sets("Traditional"),post.fit = FALSE,
                prior=c(0.2/3,0.8,0.2/3,0.2/3))
#> Data checks Complete.
#> Calculating Bayes Factors...
#> The most supported model is: 
#>  Nocturnal (Traditional)

# Results out1
round(out1$bf.table,digits=2)
#>    Prior Posterior
#> D   0.25      0.00
#> N   0.25      0.88
#> CR  0.25      0.00
#> C   0.25      0.12

# Results out 2
round(out2$bf.table,digits=2)
#>    Prior Posterior
#> D   0.07      0.00
#> N   0.80      0.99
#> CR  0.07      0.00
#> C   0.07      0.01

We see that the model weight on the nocturnal hypothesis increase because of our prior weight.

Let’s consider another case where we simulate 10 data sets with differing sample sizes to see how the prior weight affects the model weights. Specifically, we will put a high weight (0.95) on the Cathemeral Traditional hypothesis while the simulated data is from the Nocturnal hypothesis.

set.seed(451)
n=matrix(seq(10,200, 20))
sim.dat=t(apply(n,1,FUN=function(x){sim.diel(n.sim=1, n.sample=x, hyp = "N")$y}))


prior.func=function(x,prior){out= diel.fit(t(as.matrix(x)),hyp.set=hyp.sets("Traditional"),
                                    post.fit = FALSE, print=FALSE,
                                    prior=prior)
                                    out$bf.table[,2]
}

#Equal priors
results.prior1=apply(sim.dat,1,prior=c(0.25,0.25,0.25,0.25),
                     FUN=prior.func)

# Unequal priors
results.prior2=apply(sim.dat,1,prior=c(0.05/3,0.05/3,0.05/3,0.95),
                     FUN=prior.func)

#par(mfrow=c(2,1))
matplot(t(results.prior1),type="l",lwd=3,ylab="Model Probability",xaxt="n")
axis(1,at=1:length(n),lab=n)
legend("right", legend=rownames(results.prior2),col = 1:4,lwd=3,lty = 1:4)


matplot(t(results.prior2),type="l",lwd=3,ylab="Model Probability",xaxt="n")
axis(1,at=1:length(n),lab=n)
legend("top", legend=rownames(results.prior2),col = 1:4,lwd=3,lty = 1:4)

The differences in the model probabilieis are entirely due to the prior weights on each hypothesis. In the second plot, the likelihood of the data is trying to overwhelm such a highly certain prior.

Plotting Issues in RStudio

Plotting is done using the package plotly. Plotly can have issues with RStudio. If you are using RStudio and no figures are opening then: Tools–>Global Options–>Advanced–>Rendering Engine Choose “Desktop OpenGL{} and then restart RStudio.